Genomic Distances under Deletions and Insertions

Authors

  • Mark Marron
  • Krister M. Swenson
  • Bernard M. E. Moret
Abstract

As more and more genomes are sequenced, evolutionary biologists are becoming increasingly interested in evolution at the level of whole genomes, in scenarios in which the genome evolves through insertions, deletions, and movements of genes along its chromosomes. In the mathematical model pioneered by Sankoff and others, a unichromosomal genome is represented by a signed permutation of a multiset of genes; Hannenhalli and Pevzner showed that the edit distance between two signed permutations of the same set can be computed in polynomial time when all operations are inversions. El-Mabrouk extended that result to allow deletions (or conversely, a limited form of insertions which forbids duplications). In this paper we extend El-Mabrouk’s work to handle duplications as well as insertions and present an alternate framework for computing (near) minimal edit sequences involving insertions, deletions, and inversions. We derive an error bound for our polynomial-time distance computation under various assumptions and present preliminary experimental results that suggest that performance in practice may be excellent, within a few percent of the actual distance.
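
The three edit operations named in the abstract can be illustrated on a toy genome. The sketch below is only an illustration of the model, assuming a genome is stored as a Python list of nonzero integers whose sign gives gene orientation and whose absolute value names the gene (repeats allowed, as with duplications). The function names `invert`, `delete_segment`, and `insert_segment` are hypothetical and do not come from the paper, and the code does not compute the edit distance or the paper's error bound.

```python
def invert(genome, i, j):
    """Return a copy of genome with positions i..j (inclusive) inverted.

    An inversion reverses the order of the segment and flips the sign
    (orientation) of every gene in it.
    """
    if not 0 <= i <= j < len(genome):
        raise ValueError("segment bounds out of range")
    segment = [-gene for gene in reversed(genome[i:j + 1])]
    return genome[:i] + segment + genome[j + 1:]


def delete_segment(genome, i, j):
    """Return a copy of genome with positions i..j (inclusive) removed."""
    if not 0 <= i <= j < len(genome):
        raise ValueError("segment bounds out of range")
    return genome[:i] + genome[j + 1:]


def insert_segment(genome, i, genes):
    """Return a copy of genome with the list genes inserted before position i.

    The inserted genes may repeat genes already present, which models
    a duplication.
    """
    if not 0 <= i <= len(genome):
        raise ValueError("insertion point out of range")
    return genome[:i] + list(genes) + genome[i:]


if __name__ == "__main__":
    genome = [1, -3, -2, 4]
    # Inverting the middle segment (-3, -2) yields (2, 3).
    print(invert(genome, 1, 2))
    # Deleting genes 2 and 3, then inserting a segment with a duplicate of gene 2.
    shorter = delete_segment([1, 2, 3, 4], 1, 2)
    print(insert_segment(shorter, 1, [2, 2, 3]))
```

Each function returns a new list rather than modifying its argument, so a sequence of edits can be replayed or compared step by step.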

Similar resources

Insertion and Deletion Processes in Recent Human History

BACKGROUND Although insertions and deletions (indels) account for a sizable portion of genetic changes within and among species, they have received little attention because they are difficult to type, are alignment dependent and their underlying mutational process is poorly understood. A fundamental question in this respect is whether insertions and deletions are governed by similar or differen...


String editing under a combination of constraints

Let X and Y be any two strings of finite lengths N and M , respectively, over a finite alphabet. An edit distance between X and Y is defined as the minimum sum of elementary edit distances associated with edit operations of substitutions, deletions, and insertions needed to transform X to Y . In this paper, the problem of efficient computation of such a distance is considered under the assumpti...


The majority of recent short DNA insertions in the human genome are tandem duplications.

Nucleotide substitutions, insertions, and deletions constitute the principal molecular mechanisms generating genetic variation on small length scales. In contrast to substitutions, the nature of short DNA insertions and deletions (indels) is far less understood. With the recent availability of whole-genome multiple alignments between human and other primates, detailed investigations on indel ch...


The genome of Salmonella enterica serovar gallinarum: distinct insertions/deletions and rare rearrangements.

Salmonella enterica serovar Gallinarum is a fowl-adapted pathogen, causing typhoid fever in chickens. It has the same antigenic formula (1,9,12:--:--) as S. enterica serovar Pullorum, which is also adapted to fowl but causes pullorum disease (diarrhea). The close relatedness but distinct pathogeneses make this pair of fowl pathogens good models for studies of bacterial genomic evolution and the...


Mapping Insertions, Deletions and SNPs on Venter's Chromosomes

BACKGROUND The very recent availability of fully sequenced individual human genomes is a major revolution in biology which is certainly going to provide new insights into genetic diseases and genomic rearrangements. RESULTS We mapped the insertions, deletions and SNPs (single nucleotide polymorphisms) that are present in Craig Venter's genome, more precisely on chromosomes 17 to 22, and compa...



Journal:
  • Theor. Comput. Sci.

Volume 325

Publication date: 2003